Lrn
=================

Performs local response normalization (LRN).

.. math::

    \text{output}_{i,j} = \text{input}_{i,j} \times \left( \text{bias} + \alpha \sum_{k=\max(0, j-d)}^{\min(C-1, j+d)} (\text{input}_{i,k})^2 \right)^{-\beta}

Here, i indexes the out_size rows, j is the current channel, d is depth_radius, C is the total number of channels (channel), :math:`\alpha` is the scaling factor, and :math:`\beta` is the exponent.
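
The formula above can be sketched as a plain scalar loop. This is a minimal reference sketch for checking results on a host, not the library's implementation. The name ``lrn_reference`` is hypothetical, and the code assumes a contiguous row-major (out_size, channel) buffer.

.. code-block:: c

    #include <assert.h>
    #include <math.h>
    #include <stddef.h>

    /* Hypothetical reference helper (not part of the library).
     * Assumes input/output are contiguous row-major (out_size, channel). */
    static void lrn_reference(const float *input, float *output,
                              int out_size, int channel, int depth_radius,
                              float alpha, float beta, float bias)
    {
        assert(out_size >= 0 && channel > 0 && depth_radius >= 0);
        for (int i = 0; i < out_size; ++i) {
            const float *in_row = input + (size_t)i * channel;
            float *out_row = output + (size_t)i * channel;
            for (int j = 0; j < channel; ++j) {
                /* Clip the window [j - d, j + d] to the valid channel range. */
                int lo = (j - depth_radius < 0) ? 0 : j - depth_radius;
                int hi = (j + depth_radius > channel - 1) ? channel - 1
                                                          : j + depth_radius;
                float sum = 0.0f;
                for (int k = lo; k <= hi; ++k) {
                    sum += in_row[k] * in_row[k];
                }
                /* output = input * (bias + alpha * sum)^(-beta) */
                out_row[j] = in_row[j] * powf(bias + alpha * sum, -beta);
            }
        }
    }

A buffer produced by this sketch can be compared element by element, within a floating-point tolerance, against the output of the library routines described below.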

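Inputs:
    - **input** - Address of the input tensor data, with logical layout (out_size, channel).
    - **out_size** - Product of the spatial/batch dimensions (for example, N*H*W).
    - **channel** - Number of channels (C).
    - **depth_radius** - Radius of the normalization window. The full window size is 2 * depth_radius + 1; at the channel boundaries the window is clipped to the valid range.
    - **alpha** - Scaling factor :math:`\alpha`.
    - **beta** - Exponent :math:`\beta`.
    - **bias** - Bias term.
    - **core_mask** - Core mask (required only by the shared-memory version).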

Outputs:
    - **output** - Address of the computed result.
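
To illustrate how the parameters relate to a tensor's shape, the sketch below derives ``out_size``, the full window size, and a flat element offset. All three helper names are hypothetical, and the offset helper assumes that the channel dimension is innermost and contiguous (for example, an NHWC tensor flattened to (N*H*W, C)). The source states only the logical layout.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical helpers for illustration only (not part of the library). */

    /* out_size is the product of the batch/spatial dimensions. */
    static int lrn_out_size(int n, int h, int w)
    {
        return n * h * w;
    }

    /* Full (unclipped) window size for a given radius. */
    static int lrn_window_size(int depth_radius)
    {
        return 2 * depth_radius + 1;
    }

    /* Flat offset of element (row, c), assuming channels are innermost. */
    static size_t lrn_offset(int row, int c, int channel)
    {
        assert(row >= 0 && c >= 0 && c < channel);
        return (size_t)row * (size_t)channel + (size_t)c;
    }

For instance, a 1 x 16 x 16 feature map gives ``out_size = 256``, which is the value used in the calling examples on this page.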

Supported platforms:
    ``FT78NE``
    ``MT7004``

.. note::
    - FT78NE supports fp32.
    - MT7004 supports fp16 and fp32.

**Shared-memory version:**

.. c:function:: void hp_lrn_s(half* input, half* output, int out_size, int channel, int depth_radius, float alpha, float beta, float bias, int core_mask)
.. c:function:: void fp_lrn_s(float* input, float* output, int out_size, int channel, int depth_radius, float alpha, float beta, float bias, int core_mask)

**C usage example:**

.. code-block:: c
    :linenos:
    :emphasize-lines: 14

    // FT78NE example
    #include <stdio.h>
    #include <lrn.h> // assumes the header file is named lrn.h
    int main(int argc, char* argv[]) {
        float *input = (float *)0xA0000000;   // input resides in DDR
        float *output = (float *)0xC0000000;
        int out_size = 256; // e.g. N*H*W
        int channel = 96;
        int depth_radius = 5;
        float alpha = 0.0001f;
        float beta = 0.75f;
        float bias = 1.0f;
        int core_mask = 0xff;
        fp_lrn_s(input, output, out_size, channel, depth_radius, alpha, beta, bias, core_mask);
        return 0;
    }

**Private-memory version:**

.. c:function:: void hp_lrn_p(half* input, half* output, int out_size, int channel, int depth_radius, float alpha, float beta, float bias)
.. c:function:: void fp_lrn_p(float* input, float* output, int out_size, int channel, int depth_radius, float alpha, float beta, float bias)

    
**C usage example:**

.. code-block:: c
    :linenos:
    :emphasize-lines: 13

    // FT78NE example
    #include <stdio.h>
    #include <lrn.h> // assumes the header file is named lrn.h
    int main(int argc, char* argv[]) {
        float *input = (float *)0x10000000;   // input resides in L2
        float *output = (float *)0x10010000;
        int out_size = 256; // e.g. N*H*W
        int channel = 96;
        int depth_radius = 5;
        float alpha = 0.0001f;
        float beta = 0.75f;
        float bias = 1.0f;
        fp_lrn_p(input, output, out_size, channel, depth_radius, alpha, beta, bias);
        return 0;
    }